Using serological diagnostics to characterize remaining high-incidence pockets of malaria in forest-fringe Cambodia

Background Over the last decades, the number of malaria cases has drastically reduced in Cambodia. As the overall prevalence of malaria in Cambodia declines, residual malaria transmission becomes increasingly fragmented over smaller remote regions. The aim of this study was to get an insight into the burden and epidemiological parameters of Plasmodium infections on the forest-fringe of Cambodia. Methods 950 participants were recruited in the province of Mondulkiri in Cambodia and followed up from 2018 to 2020. Whole-blood samples were processed for Plasmodium spp. identification by PCR as well as for a serological immunoassay. A risk factor analysis was conducted for Plasmodium vivax PCR-detected infections throughout the study, and for P. vivax seropositivity at baseline. To evaluate the predictive effect of seropositivity at baseline on subsequent PCR-positivity, an analysis of P. vivax infection-free survival time stratified by serological status at baseline was performed. Results Living inside the forest significantly increased the odds of P. vivax PCR-positivity by a factor of 18.3 (95% C.I. 7.7–43.5). Being a male adult was also a significant predictor of PCR-positivity. Similar risk profiles were identified for P. vivax seropositivity. The survival analysis showed that serological status at baseline significantly correlated with subsequent infection. Serology is most informative outside of the forest, where 94.0% (95% C.I. 90.7–97.4%) of seronegative individuals survived infection-free, compared to 32.4% (95% C.I.: 22.6–46.6%) of seropositive individuals. Conclusion This study justifies the need for serological diagnostic assays to target interventions in this region, particularly in demographic groups where a lot of risk heterogeneity persists, such as outside of the forest. Supplementary Information The online version contains supplementary material available at 10.1186/s12936-024-04859-5.


Background
Through the intensification of global malaria control efforts, significant progress has been made worldwide towards its elimination over the past decades.Most endemic countries of the Asia-Pacific region and the Americas have set the goal of malaria elimination by 2030 [1,2].However, in recent years progress has stalled in many countries.Outside of Africa, as endemicity declines, an ever-increasing proportion of malaria cases are caused by Plasmodium vivax as opposed to the historically predominant Plasmodium falciparum.Plasmodium vivax is the most geographically widespread Plasmodium parasite, causing substantial morbidity with an estimated 14.3 million clinical malaria cases per year [3], and it is now the primary cause of malaria outside of Africa [4].
Plasmodium vivax is an understudied parasite causing low-density, often asymptomatic, infections.Nonetheless it causes significant morbidity.Several biological characteristics enable it to evade routine malaria control strategies, mostly targeted at P. falciparum.It has the particularity that it remains dormant in the liver, where it escapes acute blood-stage treatment and causes subsequent relapses.These dormant parasites, called hypnozoites, are virtually undetectable, making relapses impossible to disentangle unambiguously from reinfections.These particularities have contributed to maintaining the parasite reservoir and sustaining onward transmission [5].
The changing landscape of malaria in Cambodia is characteristic of the Greater Mekong Subregion [6,7].Like its neighbouring countries, it has set the ambitious goal of malaria elimination before 2030 [7] and deployed "last mile" accelerating strategies to achieve this target [8,9].Specifically, Cambodia projects P. falciparum malaria elimination by 2023 and all species human-malaria elimination by 2025 [10].In Thailand, Vietnam and Cambodia, a steady decrease of reported malaria cases, as well as malaria mortality has been observed over the past decade.While the decline of P. falciparum has been continuous, dropping to less than 400 yearly cases in 2022, P. vivax has been shown to be more resilient to malaria-control efforts and decreasing more slowly [6,11].Plasmodium vivax has taken over as the primary parasite, accounting for at least three quarters of the malaria burden in Cambodia [7,12,13].Most strategies set by the National Malaria Control Programme target blood-stage parasites [14], hence the marked difference in effectiveness towards P. vivax elimination.However, radical cure, which targets the liver-stage of the parasite as well as blood stages, has been deployed as part of a national P. vivax programme since 2021 [15].
As the overall prevalence of malaria in Cambodia declines, the heterogeneity in residual malaria transmission increases [10,13,16].The national malaria burden becomes increasingly fragmented over smaller and more remote regions, such as in the high-incidence provinces Ratanakiri and Mondulkiri in the East of the country [7,12].The local economic activities in these provinces are primarily logging and agriculture, activities which are associated with increased malaria risk [13].Being a vector-borne disease, malaria is highly environmentdependent.The main local vectors, primarily Anopheles dirus and Anopheles minimus are usually found in and around the forest making this the primary transmission site from mosquitoes to humans [17].The remaining high-risk areas need to be identified and characterized to allow for the continued tailored anti-malaria efforts [6].
The aim of this study was to get an insight into the shifting burden of malaria in the forest-fringe of Cambodia using molecular diagnostics and novel serological exposure markers.This study helps support locally adapted, and environment-specific decision-making, as opposed to potentially less cost-effective, one-size-fits-all approaches, with the aim of malaria elimination in Cambodia [18][19][20].

Study design
Participants aged 2-80 who had resided in the study area for at least 3 months were recruited from 10 villages in the Kaev Seima district in the high-incidence province of Mondulkiri in Cambodia.Smaller villages were oversampled as described in [13].The recruited individuals were observed for a baseline visit between December 2018 and January 2019, 12 visits at approximately monthly intervals between January 2019 and March 2020, and an additional three visits at two-to three-month intervals between June and December 2020 (Additional file 1: Fig. S1).
Data was collected on participants' village of residence, household composition, age, and sex.On each visit, the participants' haemoglobin levels, rapid diagnostic test results, Plasmodium spp.PCR testing, and species-specific Plasmodium PCR testing, if the genus-specific PCR was positive, were collected.Serological information was processed for the baseline visit only.

Laboratory methods PCR
The PCR laboratory methods were as described by Sandfort et al. [13].The blood samples collected at the interview site were stored at 4 °C.They were then separated into plasma and cell pellet and frozen at − 20 °C.At Institut Pasteur of Cambodia in Phnom Penh, cell pellets were stored at − 20 °C and plasma at − 80 °C.Infections with Plasmodium spp.parasites were determined by real-time PCR.Then qPCR specific for P. falciparum, P. vivax, Plasmodium malariae, and Plasmodium ovale was performed.

Distribution of demographics factors
The cohort was described in terms of sex ("male", "female"), age (" ≤ 16 years", or "children", and " > 16" years, or "adults") and forest coverage of village of residence ("Inside forest", "Forest fringe", and "Outside forest").The villages were classified in terms of their forest coverage based on a land cover analysis that was done in [17].Villages where ≥ 50% of households had ≥ 10% forest cover in their vicinity as of 2018 were considered "Inside forest".Those with ≥ 30% of households having ≥ 5% forest cover in their vicinity were classified as "Forest fringe", and, finally, "Outside forest" otherwise.The rationale behind the definition of these risk factor variables comes from [13].

Time-trends of Plasmodium spp. PCR-prevalence stratified by major demographic factors
For each follow-up month, from baseline to the last extra visit, the prevalence of PCR-detected infections with any Plasmodium species was computed.The prevalence was first computed for all study participants, then stratified by sex, then by age, then by forest coverage.It was finally stratified three ways, by sex, age, and forest coverage.To visualize the dominance of some Plasmodium species over others, the prevalence time trend was stratified by Plasmodium species.All these time trends were presented as line plots, corresponding to the time-dependent point estimates of prevalence, with a shaded area corresponding to the 95% binomial confidence interval.

Plasmodium vivax PCR-and seropositivity at baseline
Due to the significant dominance of P. vivax over other Plasmodium species (Additional file 1: Figure S2, Table S1), all subsequent statistical modelling was performed on P. vivax PCR and seropositivity data exclusively.Individual serological information was available from the baseline visit in the form of continuous quantifications of antibody levels to a panel of P. vivax proteins.Using this data, a random forest classification of the study population into seropositive and seronegative as described in [21] was performed.This classification algorithm uses this antibody panel to estimate with 80% specificity and 80% sensitivity whether individuals were infected with P. vivax within the past nine months.The point-estimates of the prevalence of PCR-detected P. vivax infections, and the P. vivax seroprevalence in the cohort at baseline were calculated.

Risk factors for P. vivax PCR-and seropositivity
To evaluate the demographic factors associated with P. vivax PCR-positivity, a mixed-effects logistic regression model was fitted, including a linear time-trend, all demographic predictors, and pairwise and three-way interactions between the three demographic predictors.The model above included an individual-based random intercept to account for the longitudinal structure of the data.The variance of the random intercept was reported as well as the predictor estimates, standard errors, and p-values.This model was then narrowed down to include only informative risk factors in the final model.Given that it is impossible to disentangle reinfections from relapses or continued infections in the case of PCR-detected infections on consecutive months within an individual, sequential positive results per individual were not treated differently at this stage.Each infection is treated as a novel infection.For the risk factor analyses, the three extra visits were excluded from the longitudinal analyses.These visits took place during the first wave of the COVID-19 pandemic in spring 2020, as social distancing measures and restrictions to forest logging activities were put in place in the study area [9].
Then, to investigate the association between age, sex, and forest coverage with P. vivax seropositivity at baseline, a multivariate logistic regression model was fitted.The three demographic predictors, and the significant pairwise interactions were included.

Survival analysis of P. vivax infection-free survival time
To evaluate the predictive effect of seropositivity at baseline on subsequent PCR-positivity, an analysis of P. vivax infection-free survival time after baseline, stratified by serological status at baseline was performed.Kaplan-Meyer curves of infection-free survival over the study period, stratified by serological status at baseline were first drawn.Then the analysis was repeated only considering the subjects in the demographic groups found to be most at risk, and drew Kaplan-Meyer curves, stratified again by seropositivity at baseline.This analysis was used to determine whether seropositivity status was informative in isolating risk profiles beyond demographic groups.A Cox-proportional hazards model was computed for P. vivax infection, with seropositivity as a main predictor, adjusted by the demographic variables sex, age, and forest coverage.Significance of the association between seropositivity at baseline and the hazards of subsequent P. vivax infection was evaluated at significance level α = 0.05.

Subgroup analysis of the performance of serodiagnosis for detecting future infections
Finally, it was evaluated in which subgroups of the cohort seropositivity at baseline would be predictive for a future infection in the time frame of the cohort was performed.True positives (TP) were defined as individuals who were seropositive at baseline who have at least one PCR-detected infection over the study period, and true negatives (TN) as individuals who were seronegative at baseline and did not develop a PCR-detected infection throughout the study period.In each subgroup the positive predictive value (PPV) was presented, i.e., the proportion of individuals who were seropositive at baseline who did develop an infection (TP/all seropositives), and the negative predictive value (NPV), i.e., the proportion of individuals who were seronegative at baseline who did not develop an infection during the study period (TN/all seronegatives).The corresponding 95% binomial confidence intervals were presented.This analysis allows the exploration of how well the classification algorithm developed in [21] which establishes seropositivity based on probability of past infection, is also useful in predicting the risk of future infection in different subgroups.

Study participants and observations
A total of 950 participants from 391 households were observed for an average of 10.45 of the 16 planned visits, with 285 complete cases, and 85 participants lost to follow-up immediately after baseline.A total of 9928 samples were collected.Serological information, in the form of continuous quantifications of antibody levels to a panel of P. vivax proteins, was available for 934 subjects at the baseline visit (Table 1).
Figure 1b shows the distribution of age groups in the participants observed at baseline, stratified by sex.There was a slight overrepresentation of male participants in children until age 16, followed by an overrepresentation of female participants from age 16 onwards.In terms of forest coverage, participants living inside the forest, outside the forest, and on the forest fringe were approximately equally represented (Fig. 1a).

Time-trends of Plasmodium spp. PCR-prevalence stratified by major demographic factors
Over the study period, the prevalence of PCR-detected Plasmodium infections significantly decreased by almost half across the study population, from 14% (11-16%) to 7% (5-9%) (Additional file 1: Fig. S3a).The time trends of Plasmodium infection prevalence were visualized by a three-way stratification -by sex, age, and forest coverage.Figure 2 shows a clear trend, that the further away from the forest, the more male adults separate from the rest of the population in terms of P. vivax prevalence.

Distribution of P. vivax PCR-and seropositivity at baseline
The distribution of P. vivax PCR-positivity at baseline is presented in Table 2.
Of the 950 cohort subjects, 119 were PCR positive at baseline.The lowest prevalence was found in female participants outside the forest (1% in children and 2% in adults), and on the forest fringe (3% in children and 1% in adults).The highest prevalence of P. vivax was found in male adults inside the forest at 35%, preceded by male adults outside the forest at 32%, then male children inside the forest at 23% (Table 2a).Of the 934 subjects for which serological data was available, a total of 294 were found to be seropositive.The lowest seroprevalence was found in female children outside the forest at 8%, followed by female adults outside the forest at 10%, then female children on the forest fringe at 11%.The highest seroprevalence to P. vivax was found in male adults inside the forest at 79%, followed by male adults outside the forest at 59%, female adults inside the forest at 56%, and finally male adults on the forest fringe at 54% (Table 2b).

Predictors of P. vivax PCR-and seropositivity
A mixed-effects logistic regression model including all demographic predictors as well as pairwise and three-way interactions between them was computed and is shown in Additional file 1: Table S2.The most parsimonious and informative model is shown in Table 3a.This model shows us that there is a significant downward time-trend in P. vivax prevalence.This trend is not significantly different in different demographic group as shown in Additional file 1: Tables S3-S5.Living inside the forest, compared to outside the forest, significantly increased the odds of P. vivax PCR-positivity by a factor of 18.3 (95% C.I. 7.7-43.5).Being a male adult was also a significant predictor of PCR-positivity.
A simple logistic regression on the seropositivity data at baseline, using the same predictors as above, is shown in Table 3b.This model indicates that male individuals, adults, and people living inside the forest are significantly at risk of being seropositive for P. vivax.Additionally, there is a significant interaction between being male and being an adult.More risk factors were identified for seropositivity, given the higher statistical power in detecting seropositivity over a larger time window, than PCR-infections at a specific time.Therefore, these findings are in line with the risk factors identified for P. vivax PCR-positivity.

Plasmodium vivax-infection-free survival stratified by seropositivity at cross-section
To explore the relationship between seropositivity and subsequent risk of infection, a survival analysis of time until first infection after baseline was performed on the 934 subjects for which serological information was available.Figure 3a  This survival analysis shows that serology is particularly informative outside of the forest and on the forest fringe.A table for the number of individuals at risk at every time point, stratified by seropositivity status can be found in Additional file 1: Table S8.
The Cox-proportional hazards models shown in Table 4b and Table 4c show that when considering individuals outside the forest and on the forest fringe, after adjusting for the demographic factors age and sex, shown before to be significantly associated with P. vivax infection, seropositivity is still significantly associated with higher hazards of new infections.Outside the forest, the hazard of new infection is exp(1.627)= 5.09 (95% C.I. 2.04-12.72)times higher for seropositive individuals compared to seronegative individuals, and on the forest fringe it is exp(1.055)= 2.87 (95% C.I. 1.35-6.13)times higher.Table 4a shows that seropositivity is not significantly associated with higher hazards of new infections when considering individuals inside the forest.These results confirm that serology is most informative in the settings outside the forest and on the forest fringe.As shown in Additional file 1: Table S9, the models did not violate the proportional hazards assumption.

Comparative subgroup analysis of the performance of seropositivity for predicting future infections
In this analysis, the best PPV was found outside the forest vs. inside or on the forest fringe, in males vs. in female individuals, and in children vs. in adults.The best NPV was also found outside the forest vs. inside or on the forest fringe, but in female vs. in male individuals, and again in children vs. in adults (Table 5).
Outside the forest, 54.76% (95% C.I. 48.71-59.81%) of seropositive individuals developed an infection and 95.85% (95% C.I. 93.82-97.87%) of seronegative individuals remained infection free over the study period.This confirms that it is really outside the forest that using

Discussion
Using longitudinal data of PCR-detected P. vivax infection, in the Mondulkiri province of Cambodia, this study was able to describe the main demographic risk factors associated with P. vivax infection, and the association of seropositivity with infection-free survival.Male adults and individuals living inside the forest are most at risk for P. vivax infections, which is in line with the extensive research done in the Greater Mekong Subregion [9,13,[23][24][25][26][27].Additionally, it was shown that seropositivity is a useful tool for risk identification and a proxy measure of exposure useful to disentangle unexplained variation in some demographic risk groups.
This study reflects the changing of malaria in South-East Asia and the Greater Mekong Subregion.Over past decades, significant malaria control has been achieved in Cambodia and its neighbouring countries [6,7,13].Small-scale, geographically focused studies, provide important insights into the mechanisms of this elimination and provide evidence of specific epidemiological profiles and remaining pockets of risk [13,25,26,[28][29][30][31][32].Understanding these risk profiles is crucial to effectively provide targeted efforts for the successful elimination of resilient malaria in the Greater Mekong Subregion [6].The forest and forest-fringe in Cambodia is a rapidly evolving environment that reshapes the malaria risk landscape with it [17].This study provides specific insights into this changing epidemiology of malaria on the forest fringe.Over the study period, the prevalence of PCR-detected Plasmodium infections decreased by half, from 14% (95% C.I. 11-16%) to 7% (95% C.I. 5-9%).The 1010 infections detected by PCR were almost exclusively P. vivax infections (Additional file 1: Table S1).While P. falciparum has long been the predominant malaria species in this region, elimination efforts have been largely successful and yearly national case numbers have dropped significantly, making P. vivax the most common parasite in the region [7,12,13].Thus, the epidemiological results of this study are in line with the current epidemic landscape in Cambodia and the Greater Mekong Subregion at large.It should be noted here that considering the COVID-19 pandemic, some external factors such as changes in social behaviour as well as restrictions to forest-related activities towards the end of the study period were potentially not accounted for in the analysis and could have contributed to the visible decline in Plasmodium infections [9].However, according to the Mekong Malaria Elimination: epidemiology summaries 2019-2023, a general decline in cases and deaths in Cambodia continued past the 2020 COVID-19 pandemic [6].

Time until first P. vivax infection stratified by seropositivity
The stratified demographic risk factor analysis, via mixed effects logistic regression, showed that the highest malaria prevalence was found in male adults, and in individuals living inside the forest.Moreover, the three-way stratification, by sex, age and forest coverage showed that male adults separate from all other demographic groups the further away from the forest they live.This is characteristic of two different types of exposure within and outside of the forest: peri-domestic and occupational [33].Inside the forest, all demographic groups exhibit roughly the same prevalence, indicating infections occurring around the household.Outside of the forest patterns characteristic of occupational exposure were observed, where mostly adult males are affected, being the demographic group mostly concerned with forestrelated work activities such as logging [13].On the forest fringe, a probable mixture of both types of exposure was  observed.It is thus mostly male adults and people living inside the forest who are most at risk of exposure to P. vivax malaria.Through a logistic regression, similar risk profiles were identified for seropositivity at baseline as for PCR-positivity throughout the study.
A survival analysis showed that seropositivity at baseline was a determining factor for infection-free survival during the cohort for people living outside the forest and on the forest fringe.The most striking effect occurred in residents outside the forest and on the forest fringe.The effect was still significant after adjusting for age and sex.For residents of villages inside the forest, a difference in reinfection risk between seropositive and seronegative individuals was observed, but the effect was no longer significant after adjusting for age and sex.This is what could be expected from previous research done on serology as a diagnostic tool.It is precisely in low-prevalence settings, where much heterogeneity exists, that serology can help disentangle risk profiles [34,35].It must be noted that this result is a compound effect of relapses and reinfections.Indeed, the serological classification algorithm was specifically designed to detect individuals infected within the past 6-9 months [21].Therefore, some of the shorter infection-free survival time in seropositive individuals must be due to relapses from recent primary infections.Given that relapses and reinfections are virtually impossible to disentangle unambiguously [36], it is impossible to quantify what share of the effect come from relapses and what share to the stability of risk profiles over time.
Finally, a comparative subgroup analysis of the performance of seropositivity for predicting future infections was performed.It could be seen that in terms of positive predictive value and negative predictive value, seropositivity worked best at predicting future infections outside the forest, in children and in female participants, i.e., subgroups with lower prevalence.All in all, these results show that serological information is most helpful in singling-out individuals most at-risk of infection and who would most benefit from public health interventions or prophylactic treatment in settings with low background prevalence, such as outside the forest.
A few limitations caveat the conclusions that can be drawn from these results.First, due to the impossibility to disentangle relapses from reinfections, every PCRconfirmed case was treated as a novel infection in the longitudinal analyses.However, in this setting, a high proportion of P. vivax cases are relapses, which could artifactually strengthen the clustering effect.Furthermore, only one cross-sectional serological measurement was used in the analysis rather than longitudinal measurements.It remains to be studied how much information would be gained by utilizing more frequent serological data.Nonetheless, showing that baseline seropositivity measurements are already informative at explaining variations in exposure within demographic groups is encouraging for the use of serology as a riskidentification tool in future studies [37].
This study gives some evidence that in this region, malaria clusters in specific individuals consistently over time.This stability of risk profiles in contrast to spatially sporadic malaria, has the potential to facilitate targeted interventions.To take advantage of this however, it is necessary to properly characterize and identify the individuals who are persistently at-risk of infection.While demographic factors are a good first approach, serological markers can give a more refined indication of who these at-risk individuals are.It was shown here that serological information is a particularly valuable tool to absorb the unexplained remaining variance of risk present in demographic subgroups where a lot of heterogeneity persists.While it is unclear what share of this effect is due to relapses or to the persistence of risk profiles, this is of relatively little importance from a programmatic standpoint.In this region, the main targeted intervention being considered for the future is radical cure with primaquine or prophylactic treatment.It was shown here that using serological information adds significant value in predicting future infections, from relapses or reinfections, in low-prevalence settings, where, most likely, more risk heterogeneity persists.Future research could use mathematical modelling to study the impact of targeting these risk groups on the parasite reservoir and on ongoing transmission with a serological test-and-treat intervention.

Conclusion
The results of this study show that P. vivax malaria is not a fleeting phenomenon in the Mondulkiri province.The stability of risk profiles on the medium-term is evident.It is the same small group of individuals that are consistently affected by malaria infections, male adults and people living inside the forest.While this is a pattern that has been observed at larger spatial resolutions [33], it still holds when zooming into finer pockets of high incidence.While partitioning the population according to main demographic factors can be a rough indication of risk groups (e.g., male adults and people living inside the forest), significant amounts of variation are still present within subgroups with lower background prevalence.In these settings, serological assessments are a robust and consistent way of singling out individuals in need of targeted treatment.Therefore, this study justifies the need for serological diagnostic assays to target interventions in this region.

Fig. 1 aFig. 2
Fig. 1 a Geographical distribution of villages and corresponding participant number.b Demographic pyramid, by age and gender

Fig. 3
Fig. 3 Kaplan-Meyer estimates of P. vivax infection-free survival (lines) with 95% confidence interval (shaded area) stratified by seropositivity status at the cross-section for a the whole cohort, b individuals inside the forest, c individuals living on the forest fringe, and d. individuals living outside the forest

Table 1
Cohort description

Table 2
Distribution of P. vivax infection

Table 3
Parameter estimates forThe reference level is a female child outside the forest

Table 4
Cox-proportional hazards model of infection for individuals

Table 5
Positive predictive value (PPV) and negative predictive value (NPV) of seropositivity at baseline for predicting infections during the study periodThe PPV is defined as the proportion of all seropositive individuals at baseline who go on to have a PCR-detected infection during the study period.The NPV is defined as the proportion of all seronegative individuals at baseline who go on to remain infection-free for the whole study duration